Abstract

Background: Thousands of intragenic long interspersed element 1 sequences (LINE-1 elements or L1s) reside
within genes. These intragenic L1 sequences are conserved and regulate the expression of their host genes. When
L1 methylation is decreased, either through chemical induction or in cancer, intragenic L1 transcription is
increased. The resulting L1 mRNAs form RISC complexes with pre-mRNA to degrade the complementary mRNA. In
this study, we screened for genes that are involved in intragenic L1 regulation networks.

Results: Genes containing L1s were obtained from L1Base (http://l1base.molgen.mpg.de). The expression profiles of
205 genes in 516 gene knockdown experiments were obtained from the Gene Expression Omnibus (GEO)
(http://www.ncbi.nlm.nih.gov/geo). The expression levels of the genes with and without L1s were compared using
Pearson's chi-squared test. After a permutation-based statistical analysis and multiple hypothesis testing, the
knockdown of 73 genes was found to induce significant regulatory changes (upregulation and/or downregulation) in
genes with L1s. In detail, 5 genes were found to induce both the upregulation and downregulation of genes with L1s,
whereas 27 and 37 genes induced the downregulation and upregulation, respectively, of genes with L1s. These
regulations sometimes differed depending on the cell type and the orientation of the intragenic L1s. Moreover, the
siRNA-targeted genes that regulate genes containing L1s possess a variety of molecular functions, are responsible
for many cellular phenotypes and are associated with a number of diseases.

Conclusions: Cells use intragenic L1s as cis-regulatory elements within gene bodies to modulate gene expression.
There may be several mechanisms by which L1s mediate gene expression. Intragenic L1s may be involved in the
regulation of several biological processes, including DNA damage and repair, inflammation, immune function,
embryogenesis, cell differentiation, cellular response to external stimuli and hormonal responses. Furthermore, in
addition to cancer, intragenic L1s may alter gene expression in a variety of diseases and abnormalities.

Keywords: Long interspersed element-1, cis-regulatory function, Expression profiling arrays, Intragenic LINE-1s,
Gene body regulation, Regulatory network, LINE-1 methylation



Background 

Long interspersed element-1 sequences (LINE-1 elements or L1s) are broadly distributed throughout the
human genome and are thought to have no physiological function in cells [1]. However, we recently
demonstrated that intragenic L1s, which are L1s located inside a gene body, particularly in an intron,
may regulate gene expression [2]. First, compared with intergenic L1s, human intragenic L1s contain
conserved CpG dinucleotides at the 5' UTR and sequences that are important for L1 transcription.
Second, genes containing L1s are frequently downregulated in cancer and hypomethylated normal cells.
Third, genes containing L1s are upregulated in argonaute 2 (AGO2)-downregulated embryonic kidney
cell lines. L1 hypomethylation, both in cancer and as a result of chemical induction, increases the
quantity of intragenic L1RNAs. The L1RNA forms a complex with its host gene pre-mRNA and AGO2.
Consequently, the mRNA levels of genes containing L1s are repressed in cancer [2]. In this study, we
explored whether the knockdown of other genes, in addition to AGO2, alters the mRNA level of genes
with L1s. The resulting information should lead to interesting hypotheses of how intragenic L1s
regulate gene expression and might indicate the biological processes in which this regulation is
involved.

There are more than 500,000 copies of L1s in the human genome [3]. Most L1 elements are truncated at
either the 5' region or the 3' region [4]. The full length of a putatively active L1 is approximately
6,000 base pairs, including a 5' untranslated region (5'UTR) containing promoters, open reading
frame 1 (ORF1), open reading frame 2 (ORF2) and a polyadenylation site in the 3' untranslated region
(3'UTR) [5]. In this study, we analyzed more than 10,000 L1s from the L1Base database
(http://line1.bioapps.biozentrum.uni-wuerzburg.de/l1base.php) [4] that are longer than 4.5 kb and
contain a 5' UTR. More than 2,000 of these L1s are intragenic and reside within more than 1,000 genes.

Although L1s are considered "junk" DNA [6], there are several lines of evidence that support a role
for intragenic L1s as cis-regulatory elements that play a crucial role in cell differentiation and the
maintenance of normal cellular function. Lower L1 methylation levels are associated with reduced
expression levels of the genes containing these L1s [2]. The methylation levels of intragenic L1s are
tissue specific [7]. Consequently, one of the mechanisms that lead to different gene expression
levels in different tissues may be a consequence of the epigenetic modification of intragenic
L1s [8,9].

There are several mechanisms by which L1s may regulate gene expression. Most of the known L1-related
gene regulatory mechanisms are mediated by L1RNA. One of these mechanisms is involved in
X-inactivation; during this process, L1 mRNA forms facultative heterochromatin in the inactivated
region [10-12]. An antisense 5' L1 promoter can transcribe RNA from antisense DNA sequences at the
5' end of L1 [13-16]. Moreover, the transcription from L1 can extend beyond the L1 poly-A sequence
and produce RNA from unique DNA sequences that exist beyond the 3' end of L1 [13,17]. In addition,
we showed that intragenic L1s regulate the expression of the genes containing these L1s [2].
Furthermore, there are other mechanisms by which interspersed repetitive sequences can regulate
genes. For example, Alu can mediate alternative splicing, and the LTR of human endogenous
retrovirus has been reported to possess enhancer function [18-21].

Recently, we compared the regulated mRNA levels of genes with L1s and those of genes without L1s
through Pearson's chi-squared analyses using an expression microarray of AGO2-knockdown cells [2].
We found that genes containing L1s were significantly upregulated in AGO2-knockdown cells. These
data prompted us to further explore whether AGO2 plays a role in the control of gene expression
through intragenic L1s [2].



In this study, we used publicly available data obtained from online sources to screen for genes that
interact with intragenic L1s to control gene expression. In particular, we extracted the expression
profiles obtained by gene knockdown experiments from the Gene Expression Omnibus repository (GEO
datasets: http://www.ncbi.nlm.nih.gov/gds) [22,23] and acquired information regarding intragenic L1s
from the L1Base database (http://line1.bioapps.biozentrum.uni-wuerzburg.de/l1base.php) [4]. The
Connection Up- and Down-Regulation Expression Analysis of Microarrays (CU-DREAM) software package
(http://pioneer.netserv.chula.ac.th/~achatcha/cu-dream/) [24] was used to perform various
statistical analyses, including Student's t-test and Pearson's chi-squared test, to analyze the gene
regulatory functions of intragenic L1s. We found that many genes regulate genome-wide mRNA
expression through regulatory networks of intragenic L1s. Therefore, intragenic L1s may serve as
cis-regulatory elements that mediate gene expression in a variety of normal biological conditions
and diseases.

Methods 

Data collection and template preparation 

Using "siRNA", "shRNA" and "gene knockdown" as keywords, the expression profiles from microarray
experiments that are related to the topics of interest and were performed between March 2005 and
October 2011 were obtained (Additional file 1: Table S1). All of the supplementary information,
including series matrix files and related platforms, which was freely available from the Gene
Expression Omnibus repository (GEO datasets: http://www.ncbi.nlm.nih.gov/gds) [22,23], was
downloaded. Subsequently, all of the GEO sample numbers (GSMs) were extracted for template
preparation. In the template preparation process, the "control" samples included the samples that
were labeled "scramble shRNA", "mock experiment" and "shRNA or siRNA of reporter gene", whereas the
samples that were labeled "shRNA or siRNA of gene" were defined as "experimental". Moreover, the
threshold parameter was set to a significance level of 0.01 for each regulation.
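The control/experimental assignment described above can be sketched as a simple labeling step. The sketch below is illustrative only: the keyword rules, the function names `classify_sample` and `build_template`, and the placeholder sample identifiers are assumptions made for this example, and the validated assignments actually used are those listed in Additional file 1.

```python
# Illustrative only: the keyword rules are assumptions, not the authors' procedure.
CONTROL_KEYWORDS = ("scramble", "mock", "reporter")
KNOCKDOWN_KEYWORDS = ("shrna", "sirna")


def classify_sample(label: str) -> str:
    """Return 'control', 'experimental' or 'unclassified' for a sample label."""
    text = label.lower()
    # Control categories named in the text: scramble shRNA, mock experiment,
    # and shRNA/siRNA directed against a reporter gene.
    if any(keyword in text for keyword in CONTROL_KEYWORDS):
        return "control"
    # Any remaining shRNA/siRNA sample is treated as a gene knockdown.
    if any(keyword in text for keyword in KNOCKDOWN_KEYWORDS):
        return "experimental"
    return "unclassified"


def build_template(samples: dict) -> dict:
    """Group sample accessions (hypothetical IDs) by their assigned class."""
    template = {"control": [], "experimental": [], "unclassified": []}
    for accession, label in samples.items():
        template[classify_sample(label)].append(accession)
    return template
```

Samples that fall into "unclassified" would need manual review before a t-test is run on the series.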

L1 library preparation

The putative L1 elements in the human genome were collected from L1Base
(http://line1.bioapps.biozentrum.uni-wuerzburg.de/l1base.php) [4]. Only intragenic L1s were
selected, and their host genes were compiled in a library, as described in a previous study [2]. In
this study, 3 categories of intragenic L1 were used to build the library. First, the genes
containing all types of intragenic L1s formed the "Intragenic L1 Library". Second, the genes
containing the sense strand of intragenic L1s were included in the "Sense L1 Library". Third, the
"Antisense L1 Library" included those genes that contained the antisense strand of intragenic L1s.
These libraries were primarily used for the extension programming of the CU-DREAM software package.
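A minimal sketch of how the three libraries could be assembled is given below. The record layout (`IntragenicL1`) and the function `build_libraries` are hypothetical, and the sketch assumes that "sense" means the L1 lies on the same strand as its host gene; the length and 5'UTR criteria are those stated elsewhere in the text.

```python
from typing import Dict, Iterable, NamedTuple, Set

MIN_LENGTH_BP = 4500  # length criterion stated in the text


class IntragenicL1(NamedTuple):
    """Hypothetical record for one intragenic L1 and its host gene."""
    host_gene: str
    l1_strand: str     # "+" or "-"
    gene_strand: str   # "+" or "-"
    length_bp: int
    has_5utr: bool


def build_libraries(records: Iterable[IntragenicL1]) -> Dict[str, Set[str]]:
    """Collect host genes into the intragenic, sense and antisense libraries."""
    libraries = {"intragenic": set(), "sense": set(), "antisense": set()}
    for rec in records:
        # Keep only L1s that meet the length and 5'UTR criteria.
        if rec.length_bp < MIN_LENGTH_BP or not rec.has_5utr:
            continue
        libraries["intragenic"].add(rec.host_gene)
        # Assumption: same strand as the host gene = sense, otherwise antisense.
        orientation = "sense" if rec.l1_strand == rec.gene_strand else "antisense"
        libraries[orientation].add(rec.host_gene)
    return libraries
```

Under this layout a host gene that carries both a sense and an antisense L1 appears in all three libraries.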

Statistical analysis 

First, the mRNA levels of the experimental and control samples were evaluated. Using the prepared
templates of the microarrays, series matrix files and platforms, Student's t-test was performed for
each gene to compare the means of the control and experimental groups of the examined experiments.
Each gene was then determined to be downregulated or upregulated based on the obtained p-value.
Subsequently, the distributions of upregulated and downregulated genes were evaluated using
Pearson's chi-squared test to determine whether the distributions were dependent on the presence of
an intragenic L1. Genes were classified into four groups, A through D. The significant genes with
intragenic L1s that were downregulated or upregulated were included in group A. The significant
genes without intragenic L1s were included in group B, whereas the non-significant genes with
intragenic L1s were included in group C. The remaining genes (non-significant genes without L1s)
were included in group D. The odds ratio (OR), p-value, and lower and upper bounds of the 95%
confidence interval (CI) for the genes in groups A through D were displayed in MS Excel format. All
of the statistical analyses were performed using extensions in the CU-DREAM software
(http://pioneer.netserv.chula.ac.th/~achatcha/cu-dream/) [24] (Figure 1).
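The contingency-table step can be illustrated with a short, self-contained sketch. It is not the CU-DREAM implementation: the function name `two_by_two_stats` is hypothetical, the chi-squared statistic is computed without a continuity correction (an assumption), and the confidence interval uses the usual log-OR normal approximation (also an assumption). Because the OR is a cross-product ratio, it is unchanged if the two off-diagonal cells are exchanged.

```python
import math


def two_by_two_stats(reg_l1, notreg_l1, reg_nol1, notreg_nol1):
    """OR, Pearson chi-squared (1 df), p-value and 95% CI for a 2 x 2 table.

    reg_l1      : regulated genes containing an intragenic L1
    notreg_l1   : non-regulated genes containing an intragenic L1
    reg_nol1    : regulated genes without an intragenic L1
    notreg_nol1 : non-regulated genes without an intragenic L1
    """
    a, b, c, d = reg_l1, notreg_l1, reg_nol1, notreg_nol1
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-squared statistic, no continuity correction (assumption).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # 95% CI from the standard error of log(OR) (assumption).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
          math.exp(math.log(odds_ratio) + 1.96 * se))
    return {"or": odds_ratio, "chi2": chi2, "p": p_value, "ci95": ci}


# Cell counts taken from the XIAP knockdown example reported in Table 2.
xiap = two_by_two_stats(28, 1312, 169, 18865)
```

With the XIAP counts this sketch gives an OR of about 2.38 and a p-value close to the reported 1.39 × 10^-5, which suggests that an uncorrected chi-squared test is a reasonable reading of the method.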

We also provide the p-values from a permutation test. Each gene was labeled "down", "up", or
"neutral". We randomly permuted (shuffled) these labels to produce 100,000 replicates. Multiple
hypothesis testing was corrected through false discovery rate (FDR) analysis [25]. The R statistics
software with the QVALUE package was used with the default parameters, with the exception that
"Bootstrap" was used instead of "Smoother" [26]. We performed the FDR analysis on 516 siRNA
experiments × {Down, Up} × {L1, sense L1, antisense L1} to obtain 3,096 permutation p-values. With
the restriction of the q-value to less than 0.05, the number of significant p-values was found to be
230. We also obtained π0 = 0.8293 (the estimated proportion of true null hypotheses), which is
consistent with the low number of significant p-values.
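The permutation and FDR steps can be sketched as follows. Several details are assumptions rather than the published procedure: the test statistic (the number of L1-containing genes carrying the label of interest), the one-sided comparison, the add-one estimator of the p-value, and the function names. The q-value function takes π0 as an input and does not reproduce the bootstrap estimation of π0 performed by the QVALUE package; with `pi0=1.0` it reduces to Benjamini-Hochberg adjusted p-values.

```python
import random

# 516 experiments x 2 directions x 3 libraries, as stated in the text.
N_PERMUTATION_P_VALUES = 516 * 2 * 3


def permutation_p_value(labels, has_l1, target, n_perm=100_000, seed=0):
    """One-sided permutation p-value for enrichment of `target` among L1 genes.

    labels : list of "down", "up" or "neutral", one entry per gene
    has_l1 : list of bool, True if the gene contains an intragenic L1
    """
    rng = random.Random(seed)
    observed = sum(1 for lab, l1 in zip(labels, has_l1) if l1 and lab == target)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between label and L1 status
        count = sum(1 for lab, l1 in zip(shuffled, has_l1) if l1 and lab == target)
        if count >= observed:
            hits += 1
    # Add-one estimator (assumption) so that the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def q_values(p_values, pi0=1.0):
    """Step-up q-values for a given pi0 (pi0 = 1.0 gives Benjamini-Hochberg)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * p_values[i] / rank)
        q[i] = running_min
    return q
```

In this sketch the reported π0 = 0.8293 would be passed as `pi0`, and tests with a q-value below 0.05 would be retained.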

Data analysis 

The results of the correlation of the gene knockdown experiments and the L1 libraries included the
number of genes in groups A through D, as well as the ORs, 95% CIs, p-values, permuted p-values and
q-values, which were organized by the direction of regulation. Next, the assessed experiments were
grouped according to the OR value based on the direction of regulation. Thus, this analysis yielded
7 groups: the downregulation of genes with L1s with an OR greater than 1 (and the upregulation of
these genes with an OR less than 1); the upregulation of genes with L1s with an OR greater than 1
(and the downregulation of these genes with an OR less than 1); the downregulation of genes with
L1s with an OR less than 1; the upregulation of genes with L1s with an OR less than 1; both the
downregulation and the upregulation of genes with L1s with an OR greater than 1; and both the
downregulation and the upregulation of genes with L1s



[Figure 1, panel A: flow chart: Microarray -> Student's t-test -> Chi-squared test]

[Figure 1, panel B: 2 × 2 table]

                       Up- or down-regulated            Not up- or not down-regulated
                       genes of experiment              genes of experiment
Genes containing L1    Number of genes in the           Number of genes in the
                       1st group (A)                    2nd group (B)
Genes without L1       Number of genes in the           Number of genes in the
                       3rd group (C)                    4th group (D)

Figure 1 The CU-DREAM extension program. (A) A flow chart illustrating how the microarray data were processed. First, the
microarray data for the fields of interest were collected and prepared. The program then computed the status of each gene
(upregulated or downregulated) using Student's t-test. Subsequently, the assessed genes were compared with a list of genes
containing intragenic L1s. The results showed the associations between the gene regulation status and the presence of an L1
sequence in terms of ORs and p-values. (B) A table indicating the intersection between each experimental result and the genes
containing intragenic L1s. These intersections are referred to as groups A through D. Group A includes the genes that are
upregulated or downregulated and contain intragenic L1s. Group B includes the genes that are not upregulated or downregulated
but contain intragenic L1s. Group C includes the genes that are upregulated or downregulated but do not contain intragenic
L1s. Group D includes the remaining genes, which are not upregulated or downregulated and do not contain intragenic L1s.
Table 1 A summary of the assessed gene knockdown experiments

Direction of expression changes
for genes with intragenic L1s and    No. of        No. of   Repeats/           miRNA
magnitude of the OR value            experiments   genes    co-transfections   experiments
Down and Up < 1                      0             0        0                  0
Down and Up > 1                      7             4        1                  2
Down < 1                             9             10       1                  0
Down > 1                             40            26       13                 2
Up < 1                               12            11       3                  0
Up > 1                               39            32       8                  4
non-significant                      409           172      241                13
Total                                516           255      267                21


with an OR less than 1. The genes that could not be classi- 
fied using these criteria were placed into a group of non- 
significant results. The official names of the identified 
genes and their functions were evaluated. 
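The grouping rule described above can be written out as a small decision function. This is a sketch under stated assumptions: the function names are hypothetical, significance is taken as a q-value below 0.05 (the cutoff given in the statistical analysis), and each experiment is summarized by one (OR, q-value) pair per direction.

```python
def direction_call(odds_ratio, q_value, q_cutoff=0.05):
    """Return '> 1', '< 1' or None for one direction of regulation."""
    if q_value >= q_cutoff:
        return None  # not significant in this direction
    if odds_ratio > 1:
        return "> 1"
    if odds_ratio < 1:
        return "< 1"
    return None


def classify_experiment(down, up, q_cutoff=0.05):
    """Assign an experiment to one of the 7 groups.

    down, up : (odds_ratio, q_value) tuples for the downregulation and
               upregulation analyses of genes with L1s.
    """
    d = direction_call(down[0], down[1], q_cutoff)
    u = direction_call(up[0], up[1], q_cutoff)
    if d is not None and d == u:
        return f"Down and Up {d}"
    # An OR > 1 in one direction takes precedence, with or without an
    # OR < 1 in the opposite direction, as in the group definitions.
    if d == "> 1":
        return "Down > 1"
    if u == "> 1":
        return "Up > 1"
    if d == "< 1":
        return "Down < 1"
    if u == "< 1":
        return "Up < 1"
    return "non-significant"
```

The returned strings match the row labels used in Table 1.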

Results 

In this study, we screened hundreds of expression microarray experiments that were reported in the
GEO and involved genes that were knocked down using siRNA or shRNA. The intragenic L1 elements, which
were collected from L1Base, contained 5'UTRs and at least 4,500 base pairs. The genes containing
these intragenic L1 elements were extracted to build a library of 1,454 genes. All of the
experiments were analyzed by Pearson's chi-squared test, multiple hypothesis testing and
permutation-based statistical analysis. The results were divided into 7 groups based on the
direction of regulation (up or down) and the OR values. In this manuscript, we report the results
of the analyses of 516 experiments, which represent 205 individual gene knockdowns (Table 1).

Two examples of Pearson's chi-squared tests are shown in two 2 × 2 tables (Tables 2 and 3). The first
example shows a siRNA that promoted the downregulation of genes containing intragenic L1s, whereas
the second shows a siRNA that promoted the upregulation of these genes. The genes that were
downregulated in the XIAP knockdown experiments were compared with the list of genes containing
intragenic L1s. Twenty-eight genes containing intragenic L1 elements were also downregulated in
the XIAP knockdown experiments. In contrast, 1,312 genes with intragenic L1 elements were not
downregulated
Table 2 A 2 × 2 table indicating the results of the XIAP siRNA experiment, which exhibited an OR in
the downregulation direction

XIAP siRNA    Downregulated    Not downregulated
L1            28               1,312
No L1         169              18,865

OR = 2.38; P-value = 1.39 × 10^-5
Permuted p-value = 8.00 × 10^-5, q-value = 2.12 × 10^-3



in the XIAP knockdown experiments. Moreover, 169 genes did not contain intragenic L1s but were
nevertheless downregulated by XIAP knockdown. The remaining 18,865 genes were not significantly
affected by XIAP knockdown and did not contain intragenic L1s. The OR was 2.38, and the p-value was
1.39 × 10^-5. The permuted p-value was 8.00 × 10^-5, and the q-value was 2.12 × 10^-3. This result
implies a role for XIAP in the upregulation of the expression of genes containing L1s (Table 2).

The genes that were upregulated in the IKBKAP knockdown experiments were intersected with genes
containing intragenic L1s. A total of 22 genes that contained intragenic L1s were upregulated in the
IKBKAP knockdown experiments, whereas 143 genes that were upregulated in the IKBKAP knockdown
experiments did not contain intragenic L1s. In addition, 1,315 genes contained intragenic L1s but
were not upregulated in the IKBKAP knockdown experiments, and 18,976 genes were not upregulated in
the IKBKAP knockdown experiments and did not contain intragenic L1s. The OR of this association was
2.22, and the p-value was 3.90 × 10^-4. The permuted p-value and the q-value were 9.30 × 10^-4 and
1.56 × 10^-2, respectively (Table 3). Therefore, IKBKAP represses many genes with intragenic L1s.
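As a check on the two examples, the reported odds ratios follow directly from the cell counts given above. The symbols a, b, c and d are introduced here for illustration only (a: regulated genes with L1s; b: non-regulated genes with L1s; c: regulated genes without L1s; d: non-regulated genes without L1s).

```latex
\mathrm{OR} = \frac{a \, d}{b \, c}

\mathrm{OR}_{XIAP} = \frac{28 \times 18{,}865}{1{,}312 \times 169}
                   = \frac{528{,}220}{221{,}728} \approx 2.38

\mathrm{OR}_{IKBKAP} = \frac{22 \times 18{,}976}{1{,}315 \times 143}
                     = \frac{417{,}472}{188{,}045} \approx 2.22
```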

The following results, which are shown in Table 1, Figure 2 and Additional file 1: Table S1, were
obtained from the assessed experiments: no gene knockdowns were associated with the downregulation
and upregulation of genes with L1s with an OR less than 1; 7 knockdowns were associated with the
downregulation and upregulation of genes with L1s with an OR greater than 1; 9 knockdowns

Table 3 A 2 × 2 table indicating the results of an IKBKAP siRNA experiment, which exhibited an OR in
the upregulation direction

IKBKAP siRNA    Upregulated    Not upregulated
L1              22             1,315
No L1           143            18,976

OR = 2.22; P-value = 3.90 × 10^-4
Permuted p-value = 9.30 × 10^-4, q-value = 1.56 × 10^-2

[Figure 2: pie chart; gene names by group]

Non-sig: AATF, AR, AREG, ATM, BACH1, BAP1, BMI1, BRCA1, BTG1, C15orf55, CBFA2T3, CCND1, CCNT1,
CCNT2, CDS1, CDS2, CDH1, CDK19, CDK4, CDK8, CDK9, CDR2, CDX2, CEBPB, CIITA-BX648577, CKS1B, COP1,
COPS5, CTNNB1, DDX39A, DDX39B, DEGS1, DGCR8, DICER, DSG2, DUOX1, EGLN2, EGR3, EGR4, EIF2C2, EIF2C3,
ELF5, ERBB3, ERG, ESR1, ETV1, EZH2, FAP, FOSL2, FOXA1, FOXF2, FOXM1, GAPDH, GATA3, GNE, GRN, H3.X,
H3.Y, HDAC1, HDAC2, HK2, HMGB1, HNF4A, HNF4G, HNRNPA2B1, HOXA9, HOXC6, HSF1, ICT1, IGF2BP1, IKBKAP,
IKBKB, IRF4, ITGA3, KAT8, KCNMA1, KDM2A, KDM3A, KDM4B, KEAP1, KISS1R, LSD2, MAPK1, MAPK3, MDC1,
MDH2, MED26, miR-125b, miR-16, miR-302c, MMP14, MTDH, C-MYC, NET1, NFAT5, NFBD1, NFKB1 (p100),
NFKB1 (p105), NKX2-3, NME2, NMNAT1, NOS2, NOX1, NUT, OPTN, PBX1, PHF8, PIPKIa, PIR, POU5F1, PPARA,
PPIB, PRDM14, PRKCI, PRMT1, PRMT5, PSF, PSIP1, QKI, RARA, RBM38, RBP2, RDBP, RELA, RFWD2,
RhoGDIbeta, RPS14, RUNX1, SATB1, SDHB, SETD7, SMAD4, SMILE, SNAI1, SOXU, SOX2, SP3, SPDEF, SPTLC1,
SPTLC2, SPTLC3, SRA1, SRF, STAT1, STAT3, STAT5A, STAT5B, STAU1, STK33, SUZ12, TAL1, TFAP2C, TH1L,
TOP1, TP53, TP63, TRIM33, TUT1, UPF1, VIM, WHSC1, WT1, YY1, YY2, ZNF148, ZNF263, ZNF703

DU>1: COBRA1, HTATSF1, miR-22, TH1L, WHSC2

D<1: CD44, CTNNB1, EED, HSF1, MAPK1, MAPK3, NOX1, PPARA, RFWD2, SON

D>1: AREG, BMI1, CREBZF, CSNK1A1, CXCR4, ELAVL1, FOXA1, HES6, HIF1A, HK2, HPRT1, HSF1, KDM4B, MDC1,
miR-210, PITX2, PPIB, PSIP1, PTBP1, PTBP2, SRF, STAT1, STAT5A, STAT5B, TARDBP, WHSC2, XIAP

U<1: EIF2C1, EIF2C4, HNF4A, MAPK1, MAPK3, PHF8, RFWD2, SLAMF7, SND1, TOP1, YAP1

U>1: BACH1, BHLHE40, BMPR2, CBX4, CDK19, CDK8, CEBPB, EGR3, ESR1, ESRP1, ESRP2, ETV1, FOXA1, HES6,
HK2, IGF2, IKBKAP, JUN, KDM1A, KDM4B, MED26, miR-101, MTDH, MYB, C-MYC, POU5F1, PPRC1, RDBP, RFWD2,
SF1, STAT3, WASF3

Figure 2 The pie chart illustrates the number of genes and the official gene names for the siRNA experiments. The results are
grouped in terms of the direction of the expression change, the OR and the q-value.



were associated with the downregulation of genes with L1s with an OR less than 1; 40 knockdowns were
associated with the downregulation of genes with L1s with an OR greater than 1 (and the upregulation
of these genes with an OR less than 1); 12 knockdowns were associated with the upregulation of genes
with L1s with an OR less than 1; and 39 knockdowns were associated with the upregulation of genes
with L1s with an OR greater than 1 (and the downregulation of these genes with an OR less than 1).
In the GSEs that exhibited the upregulation of genes with L1s with significant ORs (greater than 1),
the OR values indicated increases in the mRNA levels of genes containing L1s. In the GSEs involving
the downregulation of L1-containing genes with significant ORs (greater than 1), the OR values
indicated decreases in the mRNA levels of genes containing L1s.

Host genes of sense and antisense L1s were used to build the libraries and were analyzed by
Pearson's chi-squared test and a permutation test using all of the experiments and all of the
intragenic L1s. The results revealed that 15 and 12 experiments promoted the downregulation (OR > 1)
and upregulation (OR > 1), respectively, of genes containing sense L1s. In contrast, 4 significant
groups contained genes with antisense L1s: 39 experiments were associated with an OR greater than 1
with downregulation, 4 experiments were associated with an OR less than 1 with upregulation, 28
experiments were associated with an OR greater than 1 with upregulation, and 3 experiments were
associated with an OR greater than 1 with downregulation and upregulation. Using strand-dependent
intragenic L1s, the significant siRNA genes were categorized into 4 groups. The first group
contained 13 siRNA genes that were associated with genes containing sense L1s and with genes
containing antisense L1s. The second group contained 12 siRNA genes associated only with genes
containing sense L1s, whereas the third contained 41 siRNA genes associated only with genes
containing antisense L1s. The last group, which consisted of 19 siRNA genes, exhibited significant
association only when all genes containing L1s were used in the analysis (Figure 3 and Additional
file 1: Table S1).
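The four-way categorization by strand can be expressed with set operations. The sketch below is an illustration, not the authors' code: the function name `strand_groups` is hypothetical, and it assumes that each analysis (all L1s, sense L1s, antisense L1s) has already been reduced to a set of significant siRNA genes.

```python
def strand_groups(sig_all, sig_sense, sig_antisense):
    """Split significant siRNA genes into the (sense, antisense) groups.

    Each argument is a set of siRNA gene names that were significant in the
    analysis of all L1s, sense L1s or antisense L1s, respectively.
    """
    groups = {"(+, +)": set(), "(+, -)": set(), "(-, +)": set(), "(-, -)": set()}
    for gene in sig_all | sig_sense | sig_antisense:
        sense_flag = "+" if gene in sig_sense else "-"
        antisense_flag = "+" if gene in sig_antisense else "-"
        # "(-, -)" holds genes significant only with the combined library.
        groups[f"({sense_flag}, {antisense_flag})"].add(gene)
    return groups


# Small example using gene names that appear in Figure 3.
example = strand_groups(
    sig_all={"HK2", "IKBKAP"},
    sig_sense={"HK2", "AREG"},
    sig_antisense={"HK2", "XIAP"},
)
```

Here `example["(-, -)"]` contains only IKBKAP, the one gene in the example that is significant solely in the combined analysis.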
[Figure 3: four-sector diagram; gene names by group]

L1 (+, +), sense +, antisense + (13): BMI1, CBX4, HK2, HTATSF1, KDM1A, MED26, C-MYC, PPIB, PTBP1,
PTBP2, RDBP, TH1L, WHSC2

L1 (+, -), sense +, antisense - (12): AREG, BMPR2, CDK8, COP1, CSNK1A1, ESRP2, ETV1, FOXA1, HPRT1,
KDM3A, miR-210, NUT

L1 (-, +), sense -, antisense + (41): BACH1, BHLHE40, CCND1, CDK19, CEBPB, COBRA1, CREBZF, EGR3,
EIF2C4, ELAVL1, ESR1, ESRP1, ETV1, HES6, HIF1A, HNF4A, HNF4G, HSF1, JUN, KAT8, KDM4B, miR-101,
miR-210, MYB, NFKB1 (p100), NOX1, PHF8, PITX2, POU5F1, PSIP1, RARA, RFWD2, SND1, SOX2, SRF, STAT1,
STAT3, STAT5A, STAT5B, TARDBP, XIAP

L1 (-, -), sense -, antisense - (19): CD44, CTNNB1, CXCR4, EED, EIF2C1, IGF2, IKBKAP, MAPK1, MAPK3,
MDC1, MTDH, PPARA, PPRC1, SF1, SLAMF7, SON, TOP1, WASF3, YAP1

Figure 3 The diagram shows the 4 groups of significant siRNA genes that are associated with the strand of L1s: (1) siRNA genes
that were associated with genes containing sense L1s and with genes containing antisense L1s (+, +), (2) siRNA genes that were
associated with genes containing only sense L1s (+, -), (3) siRNA genes that were associated with genes containing only
antisense L1s (-, +) and (4) siRNA genes that were not associated with genes containing sense L1s or antisense L1s (-, -) but
were associated with the combined set of genes containing sense and antisense L1s.



Data from more than one siRNA experiment were 
available for several genes. Among these replicates, 10 
genes exhibited the same pattern of regulation, 39 genes 
were not significantly deregulated in one of the repli- 
cates, and 7 genes demonstrated opposing patterns of 
regulation. Notably, several factors differed between 
these various experiments, including the cell type, the 
oligomers used, and the treatments and treatment times 
(Table 4). 

We reviewed the genes that regulate genes with L1s, their molecular functions, and their association
with cellular phenotypes. These genes produce several types of proteins, such as transcription
factors, topoisomerases, histone modification proteins, RNA elongation factors, signal transduction
proteins, membrane receptors and extracellular growth factors (Figure 4 and Additional file 2:
Table S2). The genes regulating genes with L1s play a role in a number of cellular phenotypes, such
as cell differentiation, cell proliferation, hormonal response, cell homeostasis, stem cells and
viral infection, and have been reported to be associated with a number of diseases, such as cancer,
hormone-related diseases, neurodegenerative diseases, schizophrenia, diabetes, and autoimmune and
inflammation-related diseases (Figure 4 and Additional file 2: Table S2).

Discussion 

In this study, we evaluated hundreds of expression profiles from gene knockdown experiments. Each
Pearson's chi-squared test compared the gene expression within the same array; consequently, the
variations between different experiments did not interfere with each interpretation. In this study,
we used FDR analysis [25]



Table 4 Genes with more than one expression profile

Regulation: non-significant and significant
Genes: AREG, BACH1, BMI1, CDK19, CDK8, CEBPB, CTNNB1, CXCR4, EGR3, ESR1, ETV1, FOXA1, HK2, HNF4A,
HSF1, IKBKAP, KDM4B, MAPK1, MAPK3, MDC1, MED26, MTDH, MYC, NOX1, PHF8, POU5F1, PPARA, PPIB, PSIP1,
RARA, RDBP, RFWD2, SRF, STAT1, STAT3, STAT5A, STAT5B, TH1L, TOP1
Experimental differences: Differences in cell types, oligomer sets, cell passages, cell treatments
and times of transfection

Regulation: Different regulation direction
Genes: HES6, HK2, HSF1, MAPK1, MAPK3, RFWD2, WHSC2
Experimental differences: Differences in cell types, oligomer sets, cell passages, platforms, cell
treatments and times of transfection

Regulation: Same regulation direction
Genes: BACH1, BMI1, HES6, HK2, HNF4A, HTATSF1, KDM4B, PTBP1, PTBP2, XIAP
Experimental differences: Differences in cell types, oligomer sets, cell passages, platforms, cell
treatments and times of transfection

[Figure 4: three text boxes]

Molecular Functions: Transcriptional factor, enhancer binding protein, intracellular anchor protein,
splicing cofactor, ribonuclease, RISC, transcription elongation factor, histone tail modification
(methylation, acetylation, ubiquitination), DNA topoisomerase, helicase-like protein, polypyrimidine
tract-binding protein, autocrine growth factor, Ser/Thr kinase, cell-surface glycoprotein, chemokine
receptor, hexokinase, transferase, extracellular growth factor, NADPH oxidase, peptidyl-prolyl
cis-trans isomerase B, E3 ubiquitin-protein ligase, cytoplasmic protein.

Molecular and Cellular Phenotypes: Development, cell differentiation, metabolism of lipids,
nucleotides and glucose, inflammation, cell proliferation, tissue specificity, cell junction, sex
hormone, sexual development in the embryo, vitamin D function, cytokine functions, immune response,
hypoxia, cellular circadian rhythms, DNA double-strand break repair, apoptosis, cell-cell
interaction, cell cycle, intracellular transportation, organ vascularization, cellular responses to
stimuli, superoxide generation in phagosomes, endoplasmic reticulum protein associated with the
secretory pathway, mitochondrial biogenesis, p53 regulatory network, rRNA synthesis, general
recombination, site-specific recombination and cytoskeleton regulation.

Associated Diseases and Abnormalities: Cancer, hormone-related diseases, diabetes, osteoporosis,
infertility, hermaphroditism, viral infection, HIV, hepatitis B virus, degenerative disease, aging,
neurodegenerative disease, Alzheimer's disease, Parkinsonism, schizophrenia, Huntington disease,
autoimmune disease, systemic lupus erythematosus, inflammation-related disease, congenital
malformation, Lesch-Nyhan syndrome, migraine and pulmonary hypertension.

Figure 4 The figure shows the molecular functions, the molecular and cellular phenotypes and the associated diseases and
abnormalities of significant siRNA genes.



to correct for false positives arising by chance from multiple comparisons. We also performed
permutation tests to exclude the possibility of obtaining a positive association with genes
containing L1s by chance. Many siRNA treatments repressed genes containing L1s. In contrast, several
siRNA treatments increased the mRNA levels of genes with L1s. However, different results were
obtained when the same genes were knocked down in different cell types, which indicated that several
factors, some of which are tissue specific, are involved in intragenic L1-associated gene regulation
mechanisms.

Several lines of evidence support the hypothesis that there are several mechanisms by which
intragenic L1s serve as cis-regulatory elements. First, the orientation of intragenic L1s
influences the intragenic L1-associated gene regulation differently for each siRNA experiment. Some
genes regulate genes containing L1s only when the L1 orientation is sense only or antisense only,
whereas some genes demonstrated significant results regardless of the direction. Second, some
siRNAs resulted in both significant upregulation and significant downregulation. These results
suggest that the intragenic L1 isoform changes or that some genes possess at least two different L1
regulation mechanisms: one mechanism promotes gene expression at certain loci, and the other
mechanism suppresses other genes with L1s. The 73 genes that were found to significantly regulate
genes containing L1s possess a wide variety of functions. These genes produce transcription
factors, enhancer binding proteins, topoisomerases, DNA double-strand break repair proteins, histone
modification proteins, ribonucleases, RNA elongation factors, signal transduction proteins,
membrane proteins and even extracellular proteins. Some of these genes may directly regulate genes
containing L1s, which suggests multiple gene regulation mechanisms.

We recently reported the results of a Pearson's chi-squared test that showed the role of AGO2 in the
regulation of L1-containing genes. We confirmed the presence of the AGO2-pre-mRNA-L1 RNA complex by
immunoprecipitation assay [2]. In this study, we screened hundreds of genes to identify those that
regulate L1-containing genes. Further experiments are required to confirm the effect and to define
the mechanism by which individual gene knockdowns regulate L1 expression.

Because changes in the expression of genes containing intragenic L1 sequences were found as a result
of the knockdown of genes that affect a variety of cellular phenotypes or diseases, we hypothesize
that the regulated L1-containing genes may be involved in a wide range of biological processes,
including diseases and abnormalities. The 73 genes that were identified have the following
functions: control of cell differentiation, cell proliferation, hormonal response, cell
homeostasis, stem cells, immune response, genomic stability and viral infection. Furthermore,
deregulations of these significant genes were associated with many diseases in addition to cancer,
such as hormone-related diseases, neurodegenerative diseases, schizophrenia, diabetes, and
autoimmune and inflammation-related diseases.

Conclusions 

Our study indicated an association between intragenic L1s and many genes that are mediators of
genome-wide regulation. Therefore, L1s act as cis-regulatory elements. There are a number of genes
that regulate genes with L1s. These regulatory genes possess a variety of molecular functions. This
result suggests multiple regulatory mechanisms of gene control by intragenic L1s. Furthermore, based
on the variable functions of the regulating genes, intragenic L1s may mediate several cellular
phenotypes and are associated with the genome-wide gene expression observed in several diseases.

Additional files 



Additional file 1: Table S1. Validated case and control samples and
results from each experiment.

Additional file 2: Table S2. Molecular functions and phenotypes of
significant genes.